5  Reproducible Documents with Quarto

5.1 Overview & definitions

  • A reproducible document is a file or set of files, typically in a scientific or data-driven context, that includes both the content (text, tables, figures) & the code or instructions required to generate and update that content.
    • Designed to ensure that others can reproduce the same document, including its data analysis, results, and visualizations, consistently & accurately.
    • Reproducible documents are essential for transparent & verifiable research, allowing others to verify and build upon the work.
  • Literate programming is a coding & documentation approach where code and explanations are combined in a single document.
    • Emphasizes clear and understandable code by interleaving human-readable text (explanations, comments, and documentation) with executable code.
    • Fosters better communication and understanding among programmers.

5.2 Literate programming

  • Document a program by
    • Writing explanations in a natural language, interspersed with
    • Snippets of code written in a programming language
  • Two operations on this file:
    • Weaving: Creating a document incorporating the code and its results
    • Tangling: Extracting the code into a script file for running
  • In the R ecosystem, this was first implemented as Sweave
    • Uses LaTeX to provide the mathematical & textual bits (we’ll talk about LaTeX soon)
    • The code bits are in R
  • Sweave was superseded by rmarkdown and knitr
  • These were in turn superseded by Quarto in both the R and Python ecosystems
  • Quarto integrates with Jupyter notebooks using polyglot cells
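
To make the tangling operation concrete, here is a minimal Python sketch. It is not how Sweave, knitr, or Quarto implement tangling; the sample source text, the `tangle` function, and the regular expression are all assumptions made for illustration, and only chunks opened with `{python}` are recognized.

```python
import re

FENCE = "`" * 3  # built programmatically so the example can sit inside a fence itself

# A tiny literate source: prose interleaved with one fenced Python chunk.
SOURCE = "\n".join([
    "# Analysis",
    "",
    "We compute a total.",
    "",
    FENCE + "{python}",
    "total = sum([1, 2, 3])",
    "print(total)",
    FENCE,
    "",
    "The total is reported above.",
])

# Match an opening fence tagged {python} and capture everything up to the closing fence.
CHUNK_RE = re.compile(
    r"^" + FENCE + r"\{python\}\n(.*?)^" + FENCE + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)

def tangle(source: str) -> str:
    """Return only the code chunks, concatenated into one script."""
    chunks = [match.group(1).rstrip("\n") for match in CHUNK_RE.finditer(source)]
    return "\n".join(chunks) + "\n"

script = tangle(SOURCE)
print(script)
```

The prose is discarded and only the two code lines survive, which is the essence of tangling: the same source file can yield a runnable script as well as a document.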

5.3 Reproducible documents

Adapting the literate programming method to creating documents

  • Put text and code in the same document, with access to the data from the code
  • Weave the document so that the results of your analysis are incorporated into the document
  • Given the same data and code, you’ll create the same document (reproducible)
  • You can look at the source file to see how the results presented are generated
  • Potentially you can use the same source to create many different kinds of outputs

The main purpose here is to create documents, slides, websites, blogs, books etc.

The code is a means to creating output for these products, rather than just documenting code
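
As a rough illustration of weaving, the sketch below runs each code chunk and inserts what it printed back into the document. It is a toy under stated assumptions, not the knitr or Jupyter engine: the `weave` function, the sample text, and the output layout are hypothetical, and only printed output (no figures or tables) is captured.

```python
import contextlib
import io
import re

FENCE = "`" * 3  # avoids writing a literal fence inside this example

SOURCE = "\n".join([
    "The mean of our data is computed below.",
    "",
    FENCE + "{python}",
    "data = [2, 4, 6]",
    "print(sum(data) / len(data))",
    FENCE,
])

CHUNK_RE = re.compile(
    r"^" + FENCE + r"\{python\}\n(.*?)^" + FENCE + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)

def weave(source: str) -> str:
    """Execute every chunk and place its printed output right after the code."""
    namespace = {}  # shared across chunks, so later chunks see earlier variables

    def run_chunk(match):
        code = match.group(1)
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
        # Keep the code, then append the captured output as a plain fenced block.
        return (FENCE + "python\n" + code + FENCE + "\n\n"
                + FENCE + "\n" + buffer.getvalue() + FENCE)

    return CHUNK_RE.sub(run_chunk, source)

woven = weave(SOURCE)
print(woven)
```

Running `weave` twice on the same source gives the same text, which is the reproducibility property described above: same data and code, same document.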

5.4 Tool-sets

Reproducible documents are a marriage of natural language text (usually with some markup) & a scripting language.

| Toolset | Natural language text | Scripting language | Requirement |
|---|---|---|---|
| Sweave | LaTeX | R | R |
| Jupyter notebooks | Markdown | Python, other kernels | Python |
| rmarkdown | Markdown | R, but can use Python, shell, others | R |
| Quarto | Markdown | R, Python, Julia, JavaScript, others | Independent |

There are many other toolsets as well.

5.5 Programming languages

  • We assume you already have R, Python, and/or Anaconda installed on your computer, including the standard packages.
  • You will need:
    • Python or Anaconda
    • R
    • Jupyter
    • Quarto
    • An editor such as RStudio, VS Code, or Antigravity
    • Google Colab can be an alternative
  • It will help to install the necessary packages and libraries for the course.

5.6 Markdown

Optional content: Either cover quickly or skip, students can review at their pace. As simple as text documents.

5.6.1 Markdown overview

  • Markdown is a lightweight markup language used for formatting plain text, enabling easy conversion to HTML. Popular for web content & documentation due to simplicity.
  • Why is markdown important for programmers?
  • Documentation: Markdown simplifies code documentation & explanations.
  • Readability: Enhances code readability & comprehension w/ organized formatting.
  • Collaboration: Markdown fosters efficient collaboration through clear communication of ideas and code.
  • Version Control: Markdown files play well with version control systems like Git, tracking changes effectively.
  • README.md files are generally markdown
  • Presentation: It allows seamless integration of code, visuals, and text for clearer presentations.
  • Reproducibility: Markdown aids in replicating experiments by combining code and explanations concisely.
  • Portability: Markdown files are platform-independent and easily convertible to various formats, ensuring accessibility.
  • Time Efficiency: Quick and intuitive syntax saves time on formatting, focusing on content creation.

5.6.2 Markdown fundamentals

  • Markdown is very easy to learn!
  • Take some time to get familiar with it over the coming days
  • Common uses in Data science
    • README.md files and report generation
    • Well documented coding with .ipynb, .qmd, or .rmd
  • The following cheat-sheet summarizes much of the functionality

5.6.3 Important Markdown commands

  • Headers: Create headings with `#` (e.g., `# Header 1` for the largest heading).
  • Emphasis: Use `*` or `_` for italics (`*italic*`) and `**` or `__` for bold (`**bold**`).
  • Lists: Create ordered (`1. Item`) and unordered (`- Item`) lists.
  • Links: Insert links with `[text](URL)`.
  • Images: Embed images with `![alt text](image URL)`.
  • Blockquotes: Quote text with `>` (e.g., `> This is a quote`).
  • Inline code: Format inline code with single backticks (e.g., `code`).
  • Code blocks: Fence multi-line code blocks with a line of triple backticks before and after the code.
  • Horizontal rule: Add a horizontal line with `---` or `___`.
  • Tables: Create tables using `|` and `-` (see Markdown table syntax).
  • Escape characters: Use `\` to escape special Markdown characters.
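
Because Markdown is plain text, it is easy to generate from code. The sketch below builds a pipe table from Python lists; the `markdown_table` helper is a hypothetical name introduced here, and it makes no attempt to escape `|` characters inside cells.

```python
def markdown_table(header, rows):
    """Build a Markdown pipe table from a header list and a list of rows."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",  # separator row of dashes
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)

table = markdown_table(
    ["Syntax", "Renders as"],
    [["*text*", "italic"], ["**text**", "bold"]],
)
print(table)
```

The second line of dashes is what tells a Markdown renderer that the first row is a header.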

5.7 LaTeX

Optional content: Either cover quickly or skip, students can review at their pace.

5.7.1 LaTeX Overview

  • LaTeX is a typesetting system commonly used for creating documents with complex formatting, such as research papers, theses, and academic articles.
  • Uses plain text w/ markup commands to define document structure & formatting.
  • Focuses on content & structure, allowing users to separate content from presentation.
  • Highly customizable, w/ packages & templates available for various document types.
  • LaTeX is particularly popular in academia & scientific publishing due to its support for mathematical equations and references.
  • Documents are compiled into PDF w/ LaTeX compilers like pdflatex or xelatex.
  • LaTeX is free and open-source, compatible with Windows, macOS, and Linux, and widely used in technical and scientific fields.
  • We often use Markdown for document generation rather than LaTeX. You DON'T have to be a LaTeX expert, but familiarity w/ writing math in LaTeX is important.

5.7.2 LaTeX Motivation

It’s worth learning a little bit of LaTeX both for its mathematical typesetting and for its strong formatting capabilities.

It is common for universities, journals, and publishers to provide LaTeX document classes for typesetting in their preferred manner.

5.7.3 Resources

5.7.4 LaTeX

  • The strength of LaTeX is in its strong separation of content and formatting.
  • The document is written in a computer science-y markup
  • One of its signature capabilities is typesetting mathematics, which has made it one of the main tools for authorship in STEM research.

LaTeX math for undergrads [link]

5.7.5 LaTeX: basic math commands

LaTeX math commands are fundamental for expressing mathematical equations and symbols in documents.

  • $ $: Enclose mathematical expressions for inline math mode.
  • $$ $$: Enclose mathematical expressions for display (non-inline) math mode.
  • \begin{equation} \end{equation}: Numbered equation w/ label
  • \begin{align} \end{align}: Numbered equation array w/ label
  • \frac{numerator}{denominator}: Display fractions.
  • \sqrt{...}: Add square roots.
  • \sum, \prod: Generate summation and product symbols.
  • \int: Insert integral symbols.
  • \frac{d}{dx}: Represent derivatives.
  • \lim: Display limits.
  • \infty: Represent infinity symbol.
  • \pm, \mp: Add plus-minus and minus-plus symbols.
  • \cdot, \times: Insert multiplication symbols.
  • \frac{\partial}{\partial x}: Indicate partial derivatives.
  • \left(...\right): Automatically size parentheses and brackets.
  • \alpha, \beta, etc.: Use Greek letters.
  • ^ and _: Superscripts and subscripts, e.g., x^2 \(\rightarrow x^2\) and a_{ij} \(\rightarrow a_{ij}\).
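
The commands above combine naturally. The following is a small illustrative sketch (the particular formulas are chosen here only as examples) that uses `\frac`, `\sqrt`, `\pm`, `\left(...\right)`, `\lim`, `\sum`, `\int`, `\infty`, `\cdot`, and a Greek letter inside the `equation` and `align` environments:

```latex
\begin{equation}
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\end{equation}

\begin{align}
\frac{d}{dx}\left( x^n \right) &= n x^{n-1} \\
\lim_{n \to \infty} \sum_{i=1}^{n} \frac{1}{2^i} &= 1 \\
\int_0^1 \alpha \cdot x \, dx &= \frac{\alpha}{2}
\end{align}
```

In `align`, the `&` marks the alignment point (here the equals sign) and `\\` ends each line.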

5.7.6 LaTeX installation

Most reproducible document systems require an installation of LaTeX on your computer or system. In particular, both RMarkdown and Quarto use it to generate PDF documents.

There are several distributions of LaTeX available.

However, there is a smaller, essential LaTeX distribution available through the tinytex package, based on the TeX Live distribution

quarto install tinytex

This installation remains local to Quarto, i.e., it doesn’t affect other apps on your computer that use TeX. To make this your default TeX installation on your computer, use

quarto install tinytex --update-path

See the Quarto documentation on PDF Engines, and here for other Quarto tools.

5.8 Mathpix: A very useful tool

  • mathpix: https://mathpix.com/
  • Mathpix is a tool that uses optical character recognition to convert images of handwritten or printed math equations into digital formats, aiding quick integration of math content into documents and applications.

5.9 R-Markdown

Optional content: Either cover quickly or skip, students can review at their pace.

5.9.1 RMarkdown

  • Markdown is a text-to-HTML conversion tool meant for web writers to write HTML pages without having to deal with all the syntax
    • It allows the use of LaTex for math typesetting :wink: :wink:
  • RMarkdown creates a noweb-like system for creating Markdown documents using R
  • RMarkdown was preceded by Yihui Xie’s knitr package, & still has most of its capabilities

This is still inspired by Sweave

  • Text is written in Markdown markup
  • Code chunks are delimited by ```{r} in line with Markdown syntax
  • A richer set of options for chunks

5.9.2 Pandoc

  • Pandoc is a “universal document converter” that has taken Markdown-based pipelines to a next level by allowing different kinds of outputs

  • Note the breadth of available formats for conversion
  • For us, the most useful are
    • HTML
    • Word/Powerpoint

5.9.3 RMarkdown document

Andrew Bray, rstudio::conf(2022)

5.10 RMarkdown

  • RMarkdown has a strong ecosystem around it due to its extensibility.
    • Documents
    • Interactive documents (including dashboards and Shiny apps)
    • Presentations
    • Books (see bookdown.org for many great R books freely available)
    • Websites
    • Journal templates (using the rticles package)

5.11 Jupyter

Optional content: Either cover quickly or skip, students can review at their pace.

5.11.1 Jupyter Lab & Notebook

JupyterLab is a web-based interactive user interface

  • Like its predecessor, the Jupyter Notebook, it runs a programming language kernel that provides the computational engine for the notebook
    • Wide variety of kernels are available, including R, Julia, SAS, Ruby, Javascript (based on nodejs), Matlab & many, many others (full list)
  • The text part of the notebook uses Markdown formatting
  • One can produce various outputs with JupyterLab using the nbconvert Python package
    • Primary targets are static, like HTML, LaTeX, PDF, Markdown, etc.
  • You can convert a set of Jupyter notebooks into an online book w/ jupyterbook

Source: jupyter.org

5.12 Quarto

Quarto is the next generation of RMarkdown developed by Posit

5.12.1 Quarto overview

Quarto is a modern publishing system designed for data science and technical communication, focusing on reproducibility, interactivity, and flexibility:

  • Markdown-Based: Utilizes Markdown for writing content, embedding code, and creating interactive documents.
  • Data-Driven: Supports integration of code & data to generate dynamic, up-to-date reports and documents.
  • Reproducibility: Ensures the ability to recreate documents with consistent results using source code and data.
  • Customizable Outputs: Generates various outputs like HTML, PDF, and more, allowing customization for different needs.
  • Interactive Elements: Enables creation of interactive charts, tables, and visualizations directly within documents.
  • Extensible: Offers extensibility via plugins with tools like R, Python, and Jupyter.

5.12.2 R vs. RStudio vs. Quarto

Note: IDEs and Programming Languages

RStudio (IDE)

  • GUI wrapper around R
  • Run blocks of R code (.qmd chunks)

Note: R Language

  • Programming language
  • On its own, just runs scripts via `Rscript script.R`

5.12.3 Jupyter + Python

Jupyter (IDE)

  • GUI wrapper around Python
  • Run blocks of Python code (.ipynb cells)

Note: Python

  • Scripting language
  • On its own, just runs scripts via `python script.py`

5.13 RMarkdown & Quarto

| rmarkdown | quarto |
|---|---|
| Requires R | Script-language agnostic (R or Jupyter kernel) |
| Mature ecosystem | New kid on the block |
| Uses knitr for weaving | Uses knitr or jupyter for weaving |
| Various engines for creating final document | Uses pandoc exclusively for document rendering |
| Uses more basic markdown | Allows more complex HTML structures like div blocks, tabsets, callouts, etc. |

  • RMarkdown documents based on pandoc will render in quarto
    • xaringan, a popular presentation package based on RMarkdown, renders directly to remark.js and so is not compatible with Quarto
  • We can now use Jupyter notebooks as a source for Quarto, allowing the production of Python-based documents with the same fidelity as R users experience.
    • In fact, you can use any available Jupyter kernel for the source Jupyter notebook, not just Python

5.15 Single source, many outputs

We can create content (text, code, results, graphics) within a source document, and then use different weaving engines to create different document types

  • Documents
    • Web pages (HTML)
    • Word documents
    • PDF files
  • Presentations
    • HTML
    • PowerPoint
  • Websites/blogs
  • Books
  • Dashboards
  • Interactive documents
  • Formatted journal articles

Most RMarkdown documents readily render in Quarto

5.16 Installation

5.17 So we like Quarto?

  • Keeping RMarkdown for lots of things, since it still works
  • Ability to create Python-based documents (great for ML projects)
  • Love some new things for presentations

The main story is that there is nothing wrong with either RMarkdown or Jupyter per se, but Quarto enhances the feature set and gives Python a proper reproducible document tool. RMarkdown is still richer.

The major enhancement over both is in creating presentations, and the ease of constructing particular layouts

5.18 Aside: Project website and presentation

  • It is a COURSE REQUIREMENT that you build your website and presentation with Quarto
  • It is highly recommended that you use .qmd files and NOT .ipynb files for the website building.
    • However, it is easy to convert between the two formats by using the quarto convert filename command
  • Functionally the two formats are basically identical, i.e. they’re just Markdown + Code
  • However there is ONE MAJOR DIFFERENCE, i.e. .ipynb stores the code outputs inside the file itself (in the outputs field of each code cell)
    • This means you ONLY HAVE TO RUN THE CODE ONCE with .ipynb
    • .qmd will run the code every time you build the report, which can be very slow
      • There are caching options for .qmd to avoid this; however, they are “messier” than just using .ipynb
    • Note: If .qmd has no code, then it is basically just a Markdown file .md
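
To see why an .ipynb file only needs to be run once, it helps to look at what the file contains. The sketch below hand-builds a minimal notebook in the nbformat 4 JSON layout and reads back the saved output without executing anything. The notebook content and the `stored_outputs` helper are made up for illustration; real notebooks carry more metadata.

```python
import json

# A hand-written, minimal notebook in the nbformat 4 layout (illustrative only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print(2 + 2)"],
            # The result of the last run is saved here, next to the code.
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["4\n"]}
            ],
        },
    ],
}

def stored_outputs(nb):
    """Collect the text of every saved stream output, without running any code."""
    texts = []
    for cell in nb["cells"]:
        if cell["cell_type"] != "code":
            continue
        for output in cell.get("outputs", []):
            if output.get("output_type") == "stream":
                texts.append("".join(output["text"]))
    return texts

# Round-trip through JSON text, as if the .ipynb file were written and read back.
reloaded = json.loads(json.dumps(notebook))
saved = stored_outputs(reloaded)
print(saved)
```

A .qmd file has no equivalent of the `outputs` field, which is why its code must be re-run (or cached) at render time.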

5.19 Using Quarto

Tour through various important quarto constructs

5.19.1 LaTeX in Quarto

  • Since Quarto is built on top of Markdown, coding in Quarto should be pretty intuitive

5.19.1.1 Quarto syntax

- This is an example of a bulleted list with math 
- Here is an in-line math equation $f(x)=\frac{e^{x^2}}{2}$

$$ g(x)=x^n \rightarrow \frac{\partial g}{\partial x}=n x^{n-1} $$

$$
\begin{align}
g(x)=x^n \rightarrow \frac{\partial g}{\partial x}=n x^{n-1}\\
h(x)=\int x^n dx \rightarrow h(x)=\frac{x^{n+1}}{n+1}
\end{align}
$$

5.19.1.2 Result

  • This is an example of a bulleted list with math
  • Here is an in-line math equation \(f(x)=\frac{e^{x^2}}{2}\)

\[ g(x)=x^n \rightarrow \frac{\partial g}{\partial x}=n x^{n-1} \]

\[ \begin{align} g(x)=x^n \rightarrow \frac{\partial g}{\partial x}=n x^{n-1}\\ h(x)=\int x^n dx \rightarrow h(x)=\frac{x^{n+1}}{n+1} \end{align} \]

5.19.2 Layouts

  • You can control basic layout options with ::: {layout-ncol=2}

5.19.2.1 Quarto syntax

::: {layout-ncol=2}
### List One

- Item A
- Item B


### List Two

- Item X
- Item Y

:::

5.19.2.2 Result

5.19.2.3 List One

  • Item A
  • Item B

5.19.2.4 List Two

  • Item X
  • Item Y

5.19.3 Diagrams

Quarto has native support for embedding two popular diagram-generating tools, Mermaid and Graphviz:

```{mermaid}
%%| fig-cap: "A Mermaid diagram"
%%| code-fold: false
%%| output-location: column
flowchart LR
  A[Hard edge] --> B(Round edge)
  B --> C{Decision}
  C --> D[Result one]
  C --> E[Result two]
```

A Mermaid diagram

  • You can preview Mermaid (.mmd) and Graphviz (.dot) files both in RStudio (using the DiagrammeR package) and in VS Code (using the Quarto extension)
  • For PDF and Word outputs, the diagrams are converted to PNG images using a native Chrome installation or Chromium.

5.19.4 Creating figure layouts-1

5.19.4.1 Quarto syntax

::: {layout-ncol=2}

![](./images/ch04/paste-CC007B93.png){fig-align="left" width="3.00in"}

![](./images/ch04/paste-0860D360.png){fig-align="left" width="3.00in"}

:::

5.19.4.2 Result

5.19.5 Creating figure layouts-2

5.19.5.1 Quarto syntax

  • You can create arbitrary grid patterns for figure layouts
::: {layout="[[1], [1], [1],[1]]"}
![](./images/ch04/pdf.png){width=250}
![](./images/ch04/eqn.png){width=250}
![](./images/ch04/paste-0860D360.png){width=250}
![](./images/ch04/paste-0860D360.png){width=250}
:::

5.19.5.2 Result

5.19.6 Creating figure layouts-3

#| layout-ncol: 2
#| fig-cap:
#|   - Speed and Stopping Distances of Cars
#|   - Vapor Pressure of Mercury as a Function of Temperature

#| code-fold: false
#| vscode: {languageId: r}
#| eval: false
#| echo: true

plot(cars)
plot(pressure)

5.19.7 Code highlighting

#| layout-ncol: 2
#| label: fig-charts
#| fig-cap: Charts
#| fig-subcap:
#|   - Speed and Stopping Distances of Cars
#|   - Vapor Pressure of Mercury as a Function of Temperature

#| code-fold: false
#| code-line-numbers: '|3|4-7'
#| vscode: {languageId: r}
#| eval: false
#| echo: true

plot(cars)
plot(pressure)

5.19.8 Converting file types

  • You can switch between .qmd and .ipynb with quarto
    • quarto convert clustering.qmd this will output a .ipynb version called clustering.ipynb
    • quarto convert eda.ipynb this will output a .qmd version called eda.qmd
  • Examples:
    • quarto convert filename.ipynb
    • quarto convert filename.qmd
    • quarto convert filename.rmd
    • quarto preview filename.qmd
    • quarto preview filename.ipynb
    • quarto render filename.qmd
    • quarto render filename.ipynb
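
If you want to drive these commands from a script, for example to render several files in a loop, one possible approach is to build the argument lists in Python and hand them to `subprocess`. This is only a sketch: `quarto_command` and `run_quarto` are hypothetical helper names, and the `--to` flag is only attached for `render` and `preview`, the two commands this chapter uses it with.

```python
import shutil
import subprocess

def quarto_command(action, filename, to=None):
    """Build the argument list for a quarto call; nothing is executed here."""
    if action not in {"convert", "render", "preview"}:
        raise ValueError(f"unsupported action: {action}")
    command = ["quarto", action, filename]
    if to is not None:
        if action == "convert":
            raise ValueError("this sketch only passes --to to render and preview")
        command += ["--to", to]
    return command

def run_quarto(action, filename, to=None):
    """Run the command only if a quarto executable is on the PATH."""
    command = quarto_command(action, filename, to)
    if shutil.which("quarto") is None:
        print("quarto not found; would have run:", " ".join(command))
        return None
    return subprocess.run(command, check=True)

print(quarto_command("convert", "eda.ipynb"))
print(quarto_command("render", "clustering.qmd", to="html"))
```

Passing the command as a list (rather than one shell string) avoids quoting problems with file names that contain spaces.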

5.19.9 Embedding video

{{< video https://youtu.be/Z8t4k0Q8e8Y height="400" >}}

[See Quarto Videos for more options and details]

5.19.10 Aside: YAML overview

  • Quarto relies heavily on the YAML format
    • YAML (YAML Ain’t Markup Language): A human-readable data serialization format.
    • Indentation: Uses indentation with spaces (tabs are not allowed for indentation) for structure, without relying on explicit delimiters like braces or brackets.
    • Key-Value Pairs: Represents data as key-value pairs, e.g., key: value.
    • Lists: Represents lists using hyphens, e.g., - item1, - item2.
    • Nested Structures: Supports nested data structures using indentation.
    • Comments: Allows comments with #.
    • Scalars: Represents simple data types like strings, numbers, and booleans.
    • Data Types: Supports various data types including strings, numbers, booleans, null, and dates.
    • Inclusion: Permits the use of anchors and aliases for reusing data.
    • Readability: Prioritizes human readability and is often used in configuration files and data exchange between languages.
    • No Code Execution: YAML is not intended for executing code, making it safer than some other formats for data exchange.
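
The link between YAML indentation and nested data can be made concrete with a small sketch that writes a nested Python dictionary out as YAML-style text. This is a deliberately tiny illustration, not a replacement for a YAML library: `to_yaml` and `scalar` are hypothetical helpers, the file name `extra.bib` is invented, and quoting, multi-line strings, anchors, and lists of mappings are not handled.

```python
def scalar(item):
    """Render booleans and None the way YAML spells them."""
    if item is True:
        return "true"
    if item is False:
        return "false"
    if item is None:
        return "null"
    return str(item)

def to_yaml(value, indent=0):
    """Return YAML-style lines for nested dicts and lists of scalars."""
    pad = " " * indent  # indentation uses spaces only, never tabs
    lines = []
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(item, (dict, list)):
                lines.append(f"{pad}{key}:")            # nested structure: indent deeper
                lines.extend(to_yaml(item, indent + 2))
            else:
                lines.append(f"{pad}{key}: {scalar(item)}")  # key-value pair
    elif isinstance(value, list):
        for item in value:
            lines.append(f"{pad}- {scalar(item)}")      # list items use hyphens
    return lines

header = {
    "title": "Hello, Quarto",
    "format": {"html": {"toc": True, "theme": "cerulean"}},
    "bibliography": ["references.bib", "extra.bib"],  # extra.bib is a made-up name
}
text = "\n".join(["---", *to_yaml(header), "---"])
print(text)
```

Each extra level of nesting in the dictionary shows up as two more spaces of indentation in the output, which is exactly how a Quarto header expresses that `toc` belongs to `html`, which belongs to `format`.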

5.19.11 Aside: Tip for Mac users

  • command+control+shift+4 is very useful on a mac.
    • It takes a screenshot and saves it to the clip-board
  • Windows has a similar shortcut but you will have to google it
  • The following VSC extension allows you to paste images from the clip-board with alt+command+v.
  • tab is your best friend when using the command line, since it does auto-completion
  • open ./path_to_file will open any file or directory from the command line

5.20 Citations and Bibliographies

5.20.1 Citations

When creating documents, it is imperative that we properly cite and attribute sources we use in our document

  • This is part of the Honor Code for academic integrity
  • This is about being honest with intellectual property & giving credit where credit is due.

Keeping track of sources and citations often ends up being a big deal

5.20.2 Reference management

There are several established reference managers available, both commercial and open-source.

  • Many of these can search online resources to add references and PDFs
  • Many have cloud storage allowing you to access your references from anywhere
  • They can insert citations and create bibliographies in documents

5.20.3 BibTeX

BibTeX is a tool and file format that is used to describe and process lists of references. It was developed alongside LaTeX.

  • BibTeX works especially well with the Markdown/Pandoc workflow of Quarto

A typical BibTeX entry looks like this:

@article{article_key,
  author  = {Peter Adams}, 
  title   = {The title of the work},
  journal = {The name of the journal},
  year    = 1993,
  number  = 2,
  pages   = {201-213},
  month   = 7,
  note    = {An optional note}, 
  volume  = 4
}
  • The crucial piece in this entry is the article_key by which you will refer to the citation in your document
  • There are 14 key BibTeX types, but the most common are @article, @book and @unpublished

However, you don’t need to write out these entries by hand

5.20.4 Reference managers

Several reference managers are available that provide citations in BibTeX format, among other formats

We tend to use two free managers: Mendeley and Zotero

  • Both store references in a single repository
  • Both allow you to annotate PDFs and store them with the reference
  • Both can import references from web pages using browser extensions
  • Both can create a .bib file with references that can be used by Quarto
  • Both will create automatic citation keys for each reference

Zotero is one of the citation tools recommended by GU

To see some comparisons between these and a popular commercial reference manager, see here

5.20.5 Citations in Quarto

There are a few citation formats that can be used with Quarto, through pandoc.

| Format | File extension |
|---|---|
| BibLaTeX | .bib |
| BibTeX | .bibtex |
| CSL JSON | .json |
| CSL YAML | .yaml |
| RIS | .ris |

  • The RIS format is available from many reference managers, including Endnote
  • The CSL JSON format is also an available format from many reference managers

Pandoc has the ability to automatically generate citations and a bibliography from BibTeX files and embedded references (much like we replace R/Python code with their output).

Point to .bib files in the YAML header, replacing the name w/ the file-name you’re using

---
bibliography: references.bib
---

You can also specify a citation style (e.g., APA, MLA, Chicago) using the Citation Style Language (CSL). Available styles are listed in the Zotero Style Repository

This can be specified in the YAML header as well

---
csl: nature.csl
---

You would need to download the .csl file from the repository and keep it in the same folder as your document.

You make a citation using the standard Pandoc/BibTeX syntax ([@citation]), where these are the citation-keys we saw before in the BibTeX entries.

  • You can use multiple citations at a time, separated by semi-colons
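
A common failure is citing a key that is not in the .bib file. The sketch below shows one way to check for that before rendering. It is an approximation built on simple regular expressions, not Pandoc's actual citation parser; the sample document text and the key `smith2020` are invented, and keys are assumed to consist of letters, digits, and underscores.

```python
import re

BIB_TEXT = """
@article{article_key,
  author = {Peter Adams},
  title  = {The title of the work},
  year   = 1993
}
"""

# Hypothetical document text; smith2020 is deliberately absent from BIB_TEXT.
DOCUMENT = "Prior work [@article_key; @smith2020] motivates this analysis."

def bib_keys(bib_text):
    """Citation keys are the token between '{' and ',' on each @type line."""
    return set(re.findall(r"@\w+\{\s*([^,\s]+)\s*,", bib_text))

def cited_keys(document):
    """Keys cited as @key inside square brackets, e.g. [@a; @b]."""
    keys = set()
    for group in re.findall(r"\[([^\]]*@[^\]]*)\]", document):
        keys.update(re.findall(r"@([A-Za-z0-9_]+)", group))
    return keys

missing = sorted(cited_keys(DOCUMENT) - bib_keys(BIB_TEXT))
print(missing)
```

Any key reported as missing would otherwise show up as an unresolved citation in the rendered document.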

Pandoc will generate a bibliography and place it in the document. It will be placed in a div with the id refs if one exists; otherwise it will be placed at the end of the document

### References

::: {#refs}
:::

Depending on the CSL format, the bibliography will conform to the ordering and format of that CSL specification.

Citation styles

American Statistical Association

Nature

Chicago

5.21 Supplementary material

Optional content: Either cover quickly or skip, students can review at their pace.

5.21.1 Using Quarto in RStudio

Given that Quarto is developed by Posit, the company behind RStudio, Quarto works quite seamlessly in RStudio

RStudio provides both a source editor and a visual editor experience for editing Quarto files

The visual editor in RStudio does not work really well with more complex constructs like columns, callouts and tabsets. It tends to change the source code leading to problems

At the top of the file, we put an (optional) YAML header to provide information about document output, general formatting rules and the like.

---
title: "Hello, Quarto"
format: html
editor: visual
---
  • You can provide specifications for multiple output types in the same YAML header
    • The first one written is the one rendered by default
---
title: "Hello, Quarto"
format:
   html: 
      theme: cerulean
      toc: true
      css: styles.css
   pdf: 
      toc: true
      number-sections: true
---
  • You can also specify the computational engine you want to use for the Quarto document in the YAML header (only if you’re using a non-R engine via Jupyter)
---
title: "Hello, Quarto"
format:
   html: 
      theme: cerulean
      toc: true
      css: styles.css
   pdf: 
      toc: true
      number-sections: true
jupyter: python3
---

You specify the Jupyter kernel you want to use as well

Important

Note that YAML is a hierarchical specification with hierarchy denoted by indentation. This is important to maintain, so your specification is clear. Note, for example, that the jupyter option is a top-level option, while the html and pdf specifications are within the format specification

Code is encapsulated in code chunks

5.21.2 R

#| label: load-packages-r
#| code-fold: false
#| eval: false
#| echo: true
library(tidyverse)

You can also add code within the Markdown text, by using inline `r` or `python` syntax

There are `r nrow(ggplot2::mpg)` observations in our data

results in

There are 234 observations in our data

5.21.3 Python

#| label: load-packages-py
#| code-fold: false
#| eval: false

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

To create “inline” code in Python, write `{python}` followed by the expression, all inside single backticks


You can add snippets to the RStudio configuration. Thomas Mock provides several useful Quarto snippets

You can run individual chunks in the R session

You can render Quarto documents from the menu:

or from the R Console

#| eval: false
#| echo: true
#install.packages("quarto")
quarto::quarto_render("hello.qmd")

You can use the R Console regardless of whether you’re running an R or Python Quarto document

5.22 Using Quarto in Visual Studio Code

Quarto in VS Code works much like in RStudio, with some differences.

  • Install the Quarto VS Code Extension which provides several usability enhancements for Quarto documents
  • VS Code tends to work better when using the Jupyter engines
  • Running individual code chunks is a little different, using the Run cell button above the code cell

5.22.1 Using Quarto in Visual Studio Code

  • You can also render a document using the Preview button at the top right of the editor

  • A set of useful snippets, and instructions for installing them, are here

5.23 Using Quarto in JupyterLab

The basic Quarto document structure is retained in JupyterLab within an .ipynb file

  • You create Markdown and Code cells just as you’re used to in Jupyter notebooks
  • For the YAML header, you use a “Raw” cell, rather than a Markdown cell
  • You can use YAML comments in Code cells for cell options

But rendering is different

  • Open a Terminal in JupyterLab
  • Render documents via the command line
quarto render hello.ipynb --to html
quarto render hello.ipynb --to docx

If you just want a live preview of the notebook (for development), so you can see updates and errors “on the fly”:

quarto preview hello.ipynb --to html

5.23.1 Using Quarto in JupyterLab

  • Quarto does not execute the .ipynb cells again when rendering the notebook
    • Cell outputs are cached in a Jupyter notebook, and so you will get what you see
    • If you want to execute the cells upon rendering, add a --execute flag
quarto render hello.ipynb --execute
  • or add to the YAML header
---
execute:
      enabled: true
---
  • You can convert between an .ipynb notebook and a text-based .qmd file
    • You can use the Jupytext package to maintain parallel, synchronized versions of .qmd and .ipynb files
quarto convert hello.ipynb # converts to qmd
quarto convert hello.qmd   # converts to ipynb 

5.23.2 Rendering Quarto documents

Important

This section is specific to our class setup and workflows and may not be applicable more generally

We use conda virtual environments in our workflows. These require some special treatment

  • There is a Posit-preferred way described here that requires you to have the virtual environment in the same folder as your project
  • We have found that a better option is to use the command line!!
    • Open terminal & activate the virtual environment: conda activate anly503
    • Render/preview the Quarto document from the command line
quarto render document.qmd --to html
quarto render document.ipynb --to html

Make sure you have the right path to the .qmd document relative to where you’re running the terminal

5.23.3 Multilingual documents

You can use both R and Python in the same document!!!

  • You should use the knitr engine
  • Add the following chunk as the first chunk in your document
#| echo: true
#| eval: false

library(reticulate)
reticulate::use_condaenv('anly503')
  • The R engine will then use the anly503 conda environment for all Python chunks.
    • It will also run all the chunks in the same Python instance, so you can pass data from one chunk to the next

The Jupyter engine does not allow multiple engines to run concurrently, so rendering both R and Python chunks in the same Jupyter document doesn’t work

5.23.4 Tangling

Sometimes it is useful to extract the actual code from the Quarto document so that it can just run without the overhead of processing the full document or notebook. This is called tangling

  • This feature is currently not implemented in Quarto :frowning:
  • For R/knitr-based quarto documents, one can use knitr::purl
#| echo: true
#| eval: false
knitr::purl("mydocument.qmd")

will result in creating mydocument.R, where the non-code parts of the document are placed in comments.

  • For Python/Jupyter, open the document in Jupyter (after converting it to .ipynb if needed) and then do one of the following:
    • Convert from the GUI, using File -> Download as -> Python (.py)
    • Use the command line: jupyter nbconvert --to python notebook.ipynb
    • Use jupytext to sync the .ipynb file with a .py file (see here)
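
For intuition about what tangling a notebook involves, the sketch below pulls the code cells out of notebook JSON using only the standard library. It is a simplified stand-in, not what `jupyter nbconvert` or jupytext actually do (they also handle magics, cell metadata, and comments); the `tangle_notebook` helper and the sample notebook are invented for illustration.

```python
import json

def tangle_notebook(nb):
    """Join the source of every code cell into one script, skipping Markdown cells."""
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # nbformat allows source as a list of lines or as a single string
            chunks.append(source if isinstance(source, str) else "".join(source))
    return "\n\n".join(chunks) + "\n"

# A made-up notebook, serialized to JSON text as it would be in an .ipynb file.
notebook_json = json.dumps({
    "nbformat": 4, "nbformat_minor": 4, "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Load data"]},
        {"cell_type": "code", "execution_count": None, "metadata": {},
         "outputs": [], "source": ["import math\n", "r = 2"]},
        {"cell_type": "code", "execution_count": None, "metadata": {},
         "outputs": [], "source": ["print(math.pi * r ** 2)"]},
    ],
})

script = tangle_notebook(json.loads(notebook_json))
print(script)
```

The resulting text could be written to a .py file and run directly, without Jupyter or Quarto in the loop.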

5.24 Summary

  • Quarto is a new document creation system that is built on top of Markdown
  • It is designed to be more flexible and extensible than RMarkdown
  • It can use R, Python and Jupyter kernels
  • It can create a wide variety of document types